The myelin water imaging transcriptome: myelin water fraction regionally varies with oligodendrocyte-specific gene expression

Identifying sensitive and specific measures that can quantify myelin are instrumental in characterizing microstructural changes in neurological conditions. Neuroimaging transcriptomics is emerging as a valuable technique in this regard, offering insights into the molecular basis of promising candidates for myelin quantification, such as myelin water fraction (MWF). We aimed to demonstrate the utility of neuroimaging transcriptomics by validating MWF as a myelin measure. We utilized data from a normative MWF brain atlas, comprised of 50 healthy subjects (mean age = 25 years, range = 17–42 years) scanned at 3 Tesla. Magnetic resonance imaging data included myelin water imaging to extract MWF and T1 anatomical scans for image registration and segmentation. We investigated the inter-regional distributions of gene expression data from the Allen Human Brain Atlas in conjunction with inter-regional MWF distribution patterns. Pearson correlations were used to identify genes with expression profiles mirroring MWF. The Single Cell Type Atlas from the Human Protein Atlas was leveraged to classify genes into gene sets with high cell type specificity, and a control gene set with low cell type specificity. Then, we compared the Pearson correlation coefficients for each gene set to determine if cell type-specific gene expression signatures correlate with MWF. Pearson correlation coefficients between MWF and gene expression for oligodendrocytes and adipocytes were significantly higher than for the control gene set, whereas correlations between MWF and inhibitory/excitatory neurons were significantly lower. Our approach in integrating transcriptomics with neuroimaging measures supports an emerging technique for understanding and validating MRI-derived markers such as MWF. Supplementary Information The online version contains supplementary material available at 10.1186/s13041-024-01115-4.

The myelin water imaging transcriptome: myelin water fraction regionally varies with oligodendrocyte-specific gene expression Introduction Myelin facilitates the rapid propagation of action potentials in the central nervous system.Pathological changes in myelin are associated with numerous disease states, including multiple sclerosis, giving rise to the need for objective quantification markers [1,2].To this end, a variety of "myelin" neuroimaging measures have been proposed [3][4][5][6][7][8][9].
Myelin water fraction (MWF) has emerged as a promising magnetic resonance imaging (MRI) metric for assessing demyelination in multiple sclerosis pathology.MWF quantifies water trapped between myelin lipid bilayers relative to water in other spaces.The validation of MWF as a specific and sensitive marker for myelin is supported by post-mortem studies with histological measures of myelin density [10][11][12].A recent neuroimaging transcriptomics study in young men has also revealed that MWF determined using one particular type of data acquisition method is associated with myelin gene expression in predominantly gray matter regions across the cerebral cortex [13].
Neuroimaging transcriptomics is a new and emerging field of research that combines transcriptomics with neuroimaging to gain insights into the cellular and molecular underpinnings of brain function and anatomy [14].The basic premise of the approach is that genes with expression patterns matching a neuroimaging feature represent the molecular signature of that feature.In healthy individuals, MWF varies significantly across white matter brain regions [15], which should, in theory, correspond with the expression pattern of oligodendrocyte-associated genes, given the key role of oligodendrocytes in myelin production and maintenance.
The aim of the current study was to examine the spatial relationship between MWF and gene expression in white matter tracts in the healthy adult human brain.We hypothesized that regional variations in MWF will be positively and uniquely associated with myelin-related gene expression.

Results and discussion
An overview of the methodological pipeline is illustrated in Fig. 1.Of the 11,074 genes initially included in our analysis, 1,257 matched to a Human Protein Atlas brain "cell type-enriched" or "low cell type specificity" gene set.The mean MWF-low cell type specificity gene set pairing (r-value) was 0.13, with a standard deviation of ± 0.43 (n = 1,094 genes).Seven cell type-enriched gene sets, with n > 5 genes, were included in our final statistical analysis: adipocytes (n = 8 genes), Muller glial cells (n = 11 genes), oligodendrocyte precursor cells (n = 13 genes), inhibitory neurons (n = 14 genes), excitatory neurons (n = 18 genes), astrocytes (n = 20 genes), and oligodendrocytes (n = 65 genes).Non-parametric statistical testing revealed significant differences across gene sets (Fig. 2A; Kruskal-Wallis: test statistic = 155.0,df = 7, p < 0.05).Compared to a low cell type specificity gene set (control; n = 1,094), MWF-gene set pairings for oligodendrocytes and adipocytes were significantly higher, while inhibitory and excitatory neurons were significantly lower (Table 1).The MWF-gene set pairings for oligodendrocytes and adipocytes were also significantly higher compared to astrocytes, Muller glial cells, and oligodendrocyte precursor cells (Supplementary Table 1).There were no significant differences between MWF-astrocytes, MWF-Muller glial cells, and MWF-oligodendrocyte precursor cell gene set pairings and the MWF-low cell type specificity (control) gene set pairing.
Aligning with our hypothesis, the MWF-oligodendrocyte gene set pairing was: (a) positive and (b) significantly higher than the pairing for control genes (i.e., the low cell type specificity gene set), in addition to all other Step 3 extracts myelin water fraction (MWF) and gene expression data from regions of interest (ROIs) using the Abagen toolbox.Step 4 generates inter-regional pairwise comparison matrices for MWF and gene expression.In these matrices, each cell represents the Cohen's d effect size between two ROIs.Step 5 correlates MWF with gene expression pairwise comparison matrices using Pearson correlation.Finally, Step 6 compares cell type-specific data referencing the Human Protein Atlas (HPA).Select images in this figure were sourced from BioRender and Canva [16][17][18][19][20] gene set pairings.The opposite was the case for MWFinhibitory and MWF-excitatory neuron gene set pairings, which were: (a) negative and (b) significantly lower than the pairing for control genes, suggesting that areas with higher gray matter content are associated with lower MWF.MWF-astrocyte, MWF-Muller glial cell, and MWF-oligodendrocyte precursor cell gene pairings were no different than the pairing for the control gene set.Interestingly, the MWF-adipocyte gene set pairing was also significantly higher than the control pairing -a novel observation that may reflect adipocytes as the main constituent of the myelin lipid bilayer [21,22].A sensitivity analysis performed at 0 mm sample-to-region matching tolerance (i.e., all microarray samples derived strictly from within the region of interest; Supplementary Fig. 2) confirmed our primary observations, with oligodendrocytes and adipocytes ranked the most strongly correlated with MWF (Fig. 2B; Table 1).Genes that were of low cell type specificity, as well as astrocytes, increased their correlation with MWF at 0 mm compared to 2 mm Table 1 Comparison of RNA single cell types to a control group.Each row evaluates whether MWF-gene pairings in high cell type specificity genes sets differ from the MWF-control gene set pairing, using the Mann-Whitney U statistic.The control set consisted of genes identified as having low cell type specificity in the Human Protein Atlas and were present in roughly similar levels across all cell types.Adjustments for multiple comparisons were made using the Bonferroni correction method.*: p < 0.05, **: p < 0.01, ***: p < 0.001, ****: p < 0.0001, *****: p is approximating 0  (Supplementary Table 2).However, low cell type specificity and astrocyte genes were still commonly negatively associated with MWF, which was rarely the case for oligodendrocytes or adipocytes.
To the best of our knowledge, we are the first to have employed a neuroimaging transcriptomics approach to understand spatial variations in MWF in white matter tracts.One earlier study examined mixed white and gray matter regions [14], where it is possible that associations between gene expression and MWF could reflect, to a large degree, differences in gray matter composition, where myelin levels are ~ 10x lower than white matter.The focus on mixed gray and white matter regions may also explain the significant negative MWF-astrocyte gene set pairing in another previous study [13].Moreover, substantial differences across white matter tracts have been previously reported for MWF, with nearly a two-fold difference between the internal capsule and the corpus collosum [15].Based on our results, these MWF variations reflect normal phenotypic variations in white matter gene expression in the brain.Such conclusion can be drawn from the association between MWF and gene expression patterns of cells responsible for forming white matter structures, specifically oligodendrocytes and adipocytes.
In addition to examining mixed white and gray matter regions, previous neuroimaging transcriptomics studies have employed different methods of estimating MWF.The combined gradient and spin-echo (GRASE) acquisition method used in the current study is a wellestablished myelin water imaging approach, having been extensively validated as an objective marker of myelin in the human brain [23,24].Conversely, the multicomponent driven equilibrium single pulse observation of T1 and T2 (mcDESPOT) acquisition method is a newer approach, whose primary advantage is faster acquisition time and higher signal to noise ratio [25].However, as noted by West and others [13,[26][27][28], the conventional mcDESPOT fitting scheme may lead to an inherent and unpredictable bias in MWF estimation.Additionally, in the context of neuroimaging transcriptomics, it appears that mcDESPOT may produce less pronounced regional variations in MWF [29].While regional differences across white matter tracts are typically more distinct with GRASE, suggesting more variability, it seems unlikely that the MWF derived from GRASE and mcDESPOT would yield comparable results.In this regard, demonstrating an association between GRASE-derived MWF and myelin related genes (i.e., oligodendrocytes) represents important and necessary validation of MWF as a myelin measure using neuroimaging transcriptomics.
Improved acquisition of mcDESPOT, namely BMC-mcDESPOT, demonstrates greater inter-regional MWF variations [30], more similar to those observed with GRASE.Additionally, while mcDESPOT tends to report higher MWF values than GRASE, we recognize this does not necessarily imply that GRASE is more accurate.MWF is influenced by various physiological factors and pulse sequence parameters that may introduce variability to the myelin measure [28,[31][32][33].We also emphasize the importance of further histological validation of existing MRI-derived myelin measures.Specifically, validation across different age groups and use of fresh samples to avoid the biases associated with formalin fixation, which can influence MWF measurements and other nuclear magnetic resonance parameters [34,35], may be worth investigating.
Our observations are also important in that MWF was estimated in both men and women, whereas Patel and colleagues only sampled MWF from males [13].Methodologically, we incorporated a novel neuroimaging transcriptomics approach, characterizing regional variations in gene expression and MWF across white matter tracts as Cohen's d effect sizes before estimating their spatial relationship.This was primarily done to further account for donor-level differences in regional gene expression that are often overlooked in neuroimaging transcriptomics studies.Further validation of this approach, including a direct comparison with other neuroimaging transcriptomics methods using age-matched cohorts from whom imaging and transcriptomics data are sourced, is warranted.

MWF calculation
For the MWF calculation, voxel-wise T2 distributions were calculated using non-negative least squares with stimulated echo correction [36].MWF was defined as the fractional signal with T2 < 40 ms.Each participant's MWF map was first registered to their corresponding 3DT1 space and subsequently warped to the standard 1 mm MNI152 space.MWF for each subject was extracted from five ROIs parcellated from the Johns Hopkins University atlas [37].This included four white matter regions (i.e., internal capsule posterior, corticospinal tract, cerebral white matter, and corpus callosum body) and one predominantly gray matter region (hippocampus) (Fig. 3B).
The AHBA microarray dataset was preprocessed using the Abagen toolbox [39][40][41].Regional microarray expression data were extracted from the same five ROIs as for MWF from each brain donor.Supplementary Fig. 1 illustrates MNI coordinates of each microarray probe included in our analysis.Microarray probes were filtered based on their expression intensity relative to background noise, such that probes with intensity less than the background in > = 50% of samples across donors were discarded.When multiple probes indexed the expression of the same gene in an ROI, we selected the probe with the highest average correlation to other probes across all samples (i.e., intensity).Samples were assigned to a brain region if their MNI coordinates were within 2 mm of each parcellation.A sensitivity analysis was also performed at 0 mm (Supplementary Fig. 2, Fig. 2B).The MNI coordinates of tissue samples were updated to those generated via non-linear registration using Advanced Normalization Tools (ANTs) [42].Normalization of tissue sample expression values across genes using a scaled robust sigmoid function was conducted to account for individual differences in donor brain expression [43].
To ensure the representativeness of the spatial variations in gene expression across donors, we only examined genes that passed the following two conditions in our downstream analyses.First, we only retained genes with expression data from all six AHBA brain donors in each ROI to maximize the number of microarray probes sampled per region.Second, we completed a pairwise Pearson correlation analysis between donors and only retained genes with a donor-to-median correlation coefficient r > 0.4, a threshold indicative of a moderately consistent expression pattern across donors [13].A total of 11,074 genes passed the two-stage consistency filter and were used in our downstream analyses.

Neuroimaging transcriptomics analysis
The premise of our neuroimaging transcriptomics analysis was to compare the inter-regional expression profile of the AHBA genes against the empirical distribution pattern of MWF (Fig. 4).To achieve this, we first generated inter-regional pairwise comparison matrices for MWF (Fig. 4A) and each gene (Fig. 4B).In these matrices, each cell represents the Cohen's d effect size between two ROIs.The Cohen's d was chosen to enable a more representative comparability between MWF and gene expression patterns and was calculated by comparing the difference in means between two ROIs while considering the variance as given in the pooled standard deviations.We then used Pearson correlation coefficients to assess the relationship between MWF and gene expression matrices (Fig. 4C).Utilizing this methodology, we identified genes whose expression patterns mirrored the interregional variations observed in the MWF matrix, thereby suggesting their potential involvement in myelin-associated cell types.

Gene categorization using single cell expression data
We leveraged the Single Cell Type Atlas provided by the Human Protein Atlas [44] to categorize the 11,074 genes that passed the two-stage consistency filter into specific cell types.This atlas presents comprehensive expression data from single-cell RNA sequencing experiments conducted on healthy human tissues and identifies genes with elevated expression in specific single cell types.Our focus was on genes exhibiting cell type-enrichment, characterized by their expression levels being at least four times higher in one cell type compared to others [45] and genes with low cell type specificity (control).We analyzed the distribution of Pearson correlation coefficients across cell types, as shown in Fig. 2. To assess statistical differences between these cell types, we utilized the Kruskal-Wallis test, which was complemented by the Mann-Whitney U test for conducting pairwise comparisons among the investigated cell types (Supplementary Table 1

Fig. 1
Fig. 1 Overview of methodological pipeline from myelin water imaging to cell type comparison.Step 1 obtains myelin water imaging data from Liu et al. (2019) [15].Step 2 retrieves gene expression data from the Allen Human Brain Atlas (AHBA).Step 3 extracts myelin water fraction (MWF) and gene expression data from regions of interest (ROIs) using the Abagen toolbox.Step 4 generates inter-regional pairwise comparison matrices for MWF and gene expression.In these matrices, each cell represents the Cohen's d effect size between two ROIs.Step 5 correlates MWF with gene expression pairwise comparison matrices using Pearson correlation.Finally, Step 6 compares cell type-specific data referencing the Human Protein Atlas (HPA).Select images in this figure were sourced from BioRender and Canva[16][17][18][19][20]

Fig. 2
Fig. 2 Distribution of Pearson correlation coefficients across MWF-cell type pairings at 2 mm (A) and 0 mm parcellation thresholds (B).The ridgeline plots depict the distribution of Pearson correlation coefficients for each MWF-cell type pairing, indicating the strength and direction of the correlation between MWF and gene expression.The number of genes in each gene set is denoted by "n".The dashed line represents the zero-correlation mark, where values to the left suggest a negative correlation and values to the right indicate a positive correlation.Cell types range from neurons to glial cells, with a control group (low cell type specificity) included for reference.Genes that were classified as having low cell type specificity in the Human Protein Atlas were present in roughly similar levels across all cell types

Fig. 3
Fig. 3 Myelin water imaging brain atlas.A. Structural T1 in MNI152 standard space (grayscale) and corresponding myelin water fraction (MWF) maps (colour scale: 0-0.2) across 50 healthy participants (adapted from Fig. 2. Liu, H., et al. [15]).B. Regional distribution of MWF across five brain regions of interest (ROIs).The anatomical location of the ROI masks in coronal view are accompanied by corresponding raincloud plots (violin plot, boxplot, and jittered data points).Gray lines connect MWF values across ROIs for individual participants

Fig. 4
Fig. 4 Inter-regional myelin water fraction and gene expression profiles.(A) Inter-regional pairwise comparison matrix for myelin water fraction (MWF), with each cell representing the Cohen's d effect size between the five regions of interest (ROIs).(B) Inter-regional pairwise comparison matrices for gene expression patterns of three sample genes, FAM173B (expression has strong positive correlation with MWF), UBE2R2 (expression has no correlation with MWF), and GOLGA8A (expression has strong negative correlation with MWF), with Cohen's d effect size between the five ROIs.(C) Scatterplots illustrating the correlation between MWF and gene expression for the aforementioned genes.Only data from the cells above the diagonal line of zeros in the matrices were included to eliminate redundant comparisons.Each plot includes the Pearson correlation coefficient and the FDR-adjusted p-value (p adj ).Linear regression lines, represented in gray, are accompanied by shaded areas denoting confidence intervals.Abbreviations: IC = internal capsule posterior, CS = corticospinal tract, WM = cerebral white matter, CC = corpus callosum body, HC = hippocampus

High Cell Type Specificity Gene Set Test Statistic Std. Error Std. Test Statistic P value Adjusted p value
).